Notes on Nonnegative Tensor Factorization of the Spectrogram for Audio Source Separation: Statistical Insights and Towards Self-Clustering of the Spatial Cues
Abstract
Nonnegative tensor factorization (NTF) of multichannel spectrograms under PARAFAC structure has recently been proposed by Fitzgerald et al. as a means of performing blind source separation (BSS) of multichannel audio data. In this paper, we investigate the statistical source models implied by this approach. We show that it implicitly assumes a nonpoint-source model, in contrast with usual BSS assumptions, and we clarify the links between the measure of fit chosen for the NTF and the implied statistical distribution of the sources. While the original approach of Fitzgerald et al. requires a posterior clustering of the spatial cues to group the NTF components into sources, we discuss means of performing the clustering within the factorization. In the results section, we test the impact of the simplifying nonpoint-source assumption on underdetermined linear instantaneous mixtures of musical sources and discuss the limits of the approach for such mixtures.
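To make the PARAFAC structure mentioned in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a nonnegative spectrogram tensor `V` of shape (frequency, frame, channel) and approximates it as a sum over components of `W[f, k] * H[n, k] * Q[i, k]`. The names `W`, `H`, `Q`, `parafac_ntf` and `kl_divergence` are illustrative. The generalized Kullback-Leibler divergence with standard multiplicative updates is assumed as the measure of fit; the abstract does not say which one is used.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and its model V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def reconstruct(W, H, Q):
    """PARAFAC model: V_hat[f, n, i] = sum_k W[f, k] * H[n, k] * Q[i, k]."""
    return np.einsum("fk,nk,ik->fni", W, H, Q)


def parafac_ntf(V, n_components, n_iter=200, seed=0, eps=1e-12):
    """Fit a nonnegative PARAFAC model with multiplicative KL updates.

    V is a nonnegative array of shape (F, N, I): frequency bins, time frames,
    channels. Returns spectral patterns W (F x K), temporal activations
    H (N x K), per-component channel gains Q (I x K), and the cost history.
    """
    rng = np.random.default_rng(seed)
    F, N, I = V.shape
    # Random strictly positive initialization (an assumption of this sketch).
    W = rng.random((F, n_components)) + eps
    H = rng.random((N, n_components)) + eps
    Q = rng.random((I, n_components)) + eps

    costs = [kl_divergence(V, reconstruct(W, H, Q))]
    for _ in range(n_iter):
        # Each factor is rescaled by a ratio of nonnegative terms,
        # so all factors remain nonnegative throughout.
        R = V / (reconstruct(W, H, Q) + eps)
        W *= np.einsum("fni,nk,ik->fk", R, H, Q) / (H.sum(0) * Q.sum(0) + eps)

        R = V / (reconstruct(W, H, Q) + eps)
        H *= np.einsum("fni,fk,ik->nk", R, W, Q) / (W.sum(0) * Q.sum(0) + eps)

        R = V / (reconstruct(W, H, Q) + eps)
        Q *= np.einsum("fni,fk,nk->ik", R, W, H) / (W.sum(0) * H.sum(0) + eps)

        costs.append(kl_divergence(V, reconstruct(W, H, Q)))
    return W, H, Q, costs
```

In this sketch, each column of `Q` holds the channel gains of one component, which is the kind of spatial cue that a posterior clustering step would group into sources.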
Related resources
Nonnegative Tensor Factorization with Frequency Modulation Cues for Blind Audio Source Separation
We present Vibrato Nonnegative Tensor Factorization, an algorithm for single-channel unsupervised audio source separation with an application to separating instrumental or vocal sources with nonstationary pitch from music recordings. Our approach extends Nonnegative Matrix Factorization for audio modeling by including local estimates of frequency modulation as cues in the separation. This permi...
Nonnegative Tensor Factorization for Directional Blind Audio Source Separation
We augment the nonnegative matrix factorization method for audio source separation with cues about directionality of sound propagation. This improves separation quality greatly and removes the need for training data, but doubles the computation.
Itakura-Saito Nonnegative Factorizations of the Power Spectrogram for Music Signal Decomposition
Nonnegative matrix factorization (NMF) is a popular linear regression technique in the fields of machine learning and signal/image processing. Much research about this topic has been driven by applications in audio. NMF has been for example applied with success to automatic music transcription and audio source separation, where the data is usually taken as the magnitude spectrogram of the sound...
Beyond NMF: Time-Domain Audio Source Separation without Phase Reconstruction
This paper presents a new fundamental technique for source separation of single-channel audio signals. Although nonnegative matrix factorization (NMF) has recently become very popular for music source separation, it deals only with the amplitude or power of the spectrogram of a given mixture signal and completely discards the phase. The component spectrograms are typically estimated using a Wie...
Note Clustering Based on 2-D Source-Filter Modeling for Underdetermined Blind Source Separation
For blind source separation, the non-negative matrix factorization extracts single notes out of a mixture. These notes can be clustered to form the melodies played by a single instrument. A current approach for clustering utilizes a source filter model to describe the envelope over the first dimension of the spectrogram: the frequency-axis. The novelty of this paper is to extend this approach b...
Publication date: 2010